Context

The spatial occupancy model was refit to the 2017 wolverine survey data. This was done for two primary reasons:

  1. There were some organizational changes needed to data input files to make direct comparisons from wolverine occupancy model results from the 2017 and 2022 surveys. For details, see Motivation for and changes to data.

  2. There were two concerning elements in the fitted spatial occupancy model using the 2017 survey and subsequently used in the first wolverine manuscript (Lukacs et al. 2020).

2017 Survey

Summary Information

  • The number of sampled cells: 188
  • The number of cells in the sampling frame: 633
  • The total number of cells with wolverine detections: 56

Spatial Modeling (original)

The 2017 wolverine detection / non-detection data were fit using a Bayesian spatial occupancy model (Johnson et al. 2013) via the stocc R package.

The original spatial occupancy model fit to these data and used in the first wolverine manuscript (Lukacs et al. 2020) had two concerning issues.

First, the Markov-chain of the occupancy intercept (on probit-scale) parameter showed high autocorrelation and general lack of convergence (i.e, inference from the posterior summaries are unreliable).

The plot below shows the Markov chain Monte carlo iterations sequence (left) and the same values as the probability density of the posterior distribution (right). For appropriate inference from Bayesian models, we need the posterior distributions to be converged. This plot depics a parmaeter that is not converged and thus making inference from it is suspect.

I believe the cause of this lack of convergence is because the model was fit as an intrinsic conditional autoregressive (ICAR) model.

The second issue was that the spatial variance parameter posterior distribution is centered around an extremely large value. This parameter should generally be quite small (e.g., 0.1). Note that this parameter does look converged, in that the sequence of the Markov chain Monte carlo iterations look stable (not wandering) and looks grassy or like a hairy caterpillar. That is not the issue here. It is more that the values (see y-axis of plot on left or x-axis on plot on right) are too high to be plausible.

I believe the cause of this is due to the prior probability distribution put on the spatial variance parameter.

This is concerning because the occupancy probability for each site is altered by a realization from a Normal distribution with this parameter. The estimated site-level deviations should be generally small, but in this case they are very large. Below is a histogram of all the site-level deviations (on probit scale) from this parameter.

Spatial Modeling (new)

To refit this data, instead of using ICAR, I used restricted spatial regression (RSR) which approximates ICAR and has a higher reliability in model convergence because it reduces spatial confounding between the spatial process and covariate effects; in this case that means the occupancy intercept and occupancy coefficient for the habitat covariate.

Here is a plot of the occupancy intercept; it is well converged.

Here is a plot of the spatial variance parameter; it is well converged and at a realistic value.

This parameter leads to more realistic site-level deviations on the probit scale.

For completeness, the other parameters being estimated in the spatial occupancy model are the detection probability (on probit scale) and the occupancy coefficient for the Copeland_Inman habitat variable. They are also well converged.

The median and 95% CI’s of the occupancy slope coefficient for the Copeland_Inman habitat variables are

##       2.5%        50%      97.5% 
## -1.3006781  0.8056728  3.0085300

The probability this effect is positive is 0.77

Prediction Plots

Original Model

The below figure reproduces the exact results displayed in Lukacs et al. 2020 Figure 4a.

The spatial threshold setup in the ICAR model was 20,000 m, which means that only the neighboring cells should affect each other via this spatial process. However, we don’t see noticeable spatial variation within each block of cells; for example all the cells in greater Yellowstone ecosystem are the same color (similar occupancy values), regardless of whether there were detections or not. This is due to the spatial variance parameter being estimated much too high.

New Model

In the refit model, we can see more spatial variation in occurrence by neighboring cells. Notice there are hot spots of occurrence surroundings cells with detections and cold spots where there is a collection of non-detections. This is because the spatial variation is being more appropriately estimated.

What we are seeing is the influence of the spatial process in the model. How far this spatial effect extends is really important. The model in the 2020 manuscript and this model use a threshold of effect of 20 km, such that at most one cell is influencing the four most immediate and nearest cells. If this was not previously discussed, it should be so now. Does 20 km seem reasonable given Wolverine movements and the size of the cell? Note that the diagonal cells from a center cell are not being influenced at this distance.

New Model with State Boundaries

New Model Overall Occurence

New Model State Occupancy

Here are the posterior distributions of the number of occupied sites by state (left plot) and the overall probability of occupancy by state (right plot).

Summary Results Comparison

Model State Number.Cells Median.Occupied.cells
Manuscript ID 189 87
Updated ID 199 88
Manuscript MT 194 117
Updated MT 188 90
Manuscript WA 93 40
Updated WA 93 49
Manuscript WY 157 24
Updated WY 153 32

NOTE that the number of cells sampled has changed (see, Motivation for and changes to data). Because of this and because the model is refit with the Copeland_Inman habitat covariate (rather than just the Copeland), and because the model is now well fit, the predicted median number of occupied cells by state has changed.

Changes from the first manuscript

See details in Motivation for and changes to data

Data Changes

  • Counts of cells by state. The first manuscript used field definitions of what state was assigned to a site. Now using GIS definitions (centroid of cell).

  • Habitat covariate changed from Copeland to Copeland_Inman.

Model Changes

  • ICAR changed to RSR. Improves model convergence

  • More diffuse prior probability distribution on spatial variance parameter. Allows spatial process variance to work as intended.

References

Lukacs, P.M., Evans Mack, D., Inman, R., Gude, J.A., Ivan, J.S., Lanka, R.P., Lewis, J.C., Long, R.A., Sallabanks, R., Walker, Z., Courville, S., Jackson, S., Kahn, R., Schwartz, M.K., Torbit, S.C., Waller, J.S. and Carroll, K. (2020), Wolverine Occupancy, Spatial Distribution, and Monitoring Design. Journal of Wildife Management, 84: 841-851. https://doi.org/10.1002/jwmg.21856

Johnson, D. S., Conn, P. B., Hooten, M. B., Ray, J. C., & Pond, B. A. (2013). Spatial occupancy models for large data sets. Ecology, 94, 801-808.